---
title: 'LlamaIndex'
sidebarTitle: 'LlamaIndex'
icon: 'clock'
---

[LlamaIndex](https://llamaindex.ai/) is a framework for building context-augmented generative AI applications with LLMs.
It provides a wide range of functionality including data connectors, index building, query engines, agents, workflows and observability.
Making it easy to build powerful RAG applications.

## Using LlamaIndex and Nile together

LlamaIndex can be used with Nile to build RAG (Retrieval Augmented Generation) architectures.
You'll use LLamaIndex to simply and orchestrate the different steps in your RAG workflows, and Nile to store and query data and embeddings.

In this example, we'll show how to chat with a sales transcript in just a few lines of code, using LlamaIndex's high-level interface and its integration with Nile and OpenAI.

We'll walk you through the setup steps and then explain the code line by line. The entire Python script is available
[here](https://github.com/niledatabase/niledatabase/blob/main/examples/integrations/code_snippets/nile_llamaindex_quickstart.py)
or in [iPython/Jupyter Notebook](https://github.com/niledatabase/niledatabase/blob/main/examples/integrations/code_snippets/nile_llamaindex_quickstart.ipynb).

### Setting Up Nile

Start by signing up for [Nile](https://console.thenile.dev/?utm_campaign=partnerlaunch&utm_source=nilewebsite&utm_medium=docs). Once you've signed up for Nile, you'll be promoted to create your
first database. Go ahead and do so. You'll be redirected to the "Query Editor" page of your new database. You can see the built-in `tenants` table on the left-hand side.

From there, click on "Home" (top icon on the left menu), click on "generate credentials" and copy the resulting connection string. You will need it in a sec.

### Setting Up LlamaIndex

LlamaIndex is a Python library, so you'll need to set up a Python environment with the necessary dependencies.
We recommend using [venv](https://docs.python.org/3/library/venv.html) to create a virtual environment.
This step is optional, but it will help you manage your dependencies and avoid conflicts.

```bash
python3 -m venv llama-env
source llama-env/bin/activate
```

Once you've activated your virtual environment, you can install the necessary dependencies - LlamaIndex and the Nile Vector Store:

```bash
pip install llama-index llama-index-vector-stores-nile
```

### Setting up the data

In this example, we'll chat with sales transcript of two different companies. Download the transcripts to `./data` directory.

```bash
mkdir -p data/
wget "https://raw.githubusercontent.com/niledatabase/niledatabase/main/examples/ai/sales_insight/data/transcripts/nexiv-solutions__0_transcript.txt" -O "data/nexiv-solutions__0_transcript.txt"
wget "https://raw.githubusercontent.com/niledatabase/niledatabase/main/examples/ai/sales_insight/data/transcripts/modamart__0_transcript.txt" -O "data/modamart__0_transcript.txt"
```

### Setting up the OpenAI API key

This quickstart uses OpenAI's API to generate embeddings. So grab your [OpenAI API key](https://platform.openai.com/api-keys) and set it as an environment variable:

```bash
export OPENAI_API_KEY="your-openai-api-key"
```

## Quickstart

Open a file named `nile_llamaindex_quickstart.py` and start by importing the necessary dependencies (or follow along with the script mentioned above):

```python
import logging

logging.basicConfig(level=logging.INFO)

from llama_index.core import SimpleDirectoryReader, StorageContext
from llama_index.core import VectorStoreIndex
from llama_index.core.vector_stores import (
    MetadataFilter,
    MetadataFilters,
    FilterOperator,
)
from llama_index.vector_stores.nile import NileVectorStore, IndexType
```

### Setting up the NileVectorStore

Next, create a NileVectorStore instance:

```python
vector_store = NileVectorStore(
    service_url="postgresql://user:password@us-west-2.db.thenile.dev:5432/niledb",
    table_name="test_table",
    tenant_aware=True,
    num_dimensions=1536,
)
```

Note that in addition to the usual parameters like URL and dimensions, we also set `tenant_aware=True`. This is because we want to isolate the documents for each tenant in our vector store.

🔥 NileVectorStore supports both tenant-aware vector stores, that isolates the documents for each tenant and a regular store which is typically used for shared data that all tenants can access.
Below, we'll demonstrate the tenant-aware vector store.

### Loading and transforming the data

With all this in place, we'll load the data for the sales transcripts. We'll use LlamaIndex's `SimpleDirectoryReader` to load the documents. Because we want to update the documents with the tenant
metadata after loading, we'll use a separate reader for each tenant.

```python
reader = SimpleDirectoryReader(input_files=["nexiv-solutions__0_transcript.txt"])
documents_nexiv = reader.load_data()

reader = SimpleDirectoryReader(input_files=["modamart__0_transcript.txt"])
documents_modamart = reader.load_data()
```

We are going to create two Nile tenants and the add the tenant ID of each to the document metadata. We are also adding some additional metadata like a custom document ID and a category. This metadata can be used for filtering documents during the retrieval process.
Of course, in your own application, you could also load documents for existing tenants and add any metadata information you find useful.

```python
tenant_id_nexiv = str(vector_store.create_tenant("nexiv-solutions"))
tenant_id_modamart = str(vector_store.create_tenant("modamart"))

# Add the tenant id to the metadata
for i, doc in enumerate(documents_nexiv, start=1):
    doc.metadata["tenant_id"] = tenant_id_nexiv
    doc.metadata[
        "category"
    ] = "IT"  # We will use this to apply additional filters in a later example
    doc.id_ = f"nexiv_doc_id_{i}"  # We are also setting a custom id, this is optional but can be useful

for i, doc in enumerate(documents_modamart, start=1):
    doc.metadata["tenant_id"] = tenant_id_modamart
    doc.metadata["category"] = "Retail"
    doc.id_ = f"modamart_doc_id_{i}"
```

We are loading all documents to the same `VectorStoreIndex`. Since we created a tenant-aware `NileVectorStore` when we set things up,
Nile will correctly use the `tenant_id` field in the metadata to isolate them. Loading documents without `tenant_id` to a tenant-aware store will throw a `ValueException`.

```python
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    documents_nexiv + documents_modamart,
    storage_context=storage_context,
    show_progress=True,
)
```

### Chatting with the documents

Now that we have our vector embeddings stored in Nile, we can build a query engine for each tenant and chat with the documents:

```python
nexiv_query_engine = index.as_query_engine(
    similarity_top_k=3,
    vector_store_kwargs={
        "tenant_id": str(tenant_id_nexiv),
    },
)

print(nexiv_query_engine.query("What were the customer pain points?"))

modamart_query_engine = index.as_query_engine(
    similarity_top_k=3,
    vector_store_kwargs={
        "tenant_id": str(tenant_id_modamart),
    },
)

print(modamart_query_engine.query("What were the customer pain points?"))
```

And run the script:

```bash
python nile_llamaindex_quickstart.py
```

Nexiv is an IT company and Modamart is a retail company. You can see that the query engine for each tenant returns an answer relevant to the tenant.

Thats it! You've now built a (small)RAG application with LlamaIndex and Nile.

The
[Python script](https://github.com/niledatabase/niledatabase/blob/main/examples/integrations/code_snippets/nile_llamaindex_quickstart.py)
and [iPython/Jupyter Notebook](https://github.com/niledatabase/niledatabase/blob/main/examples/integrations/code_snippets/nile_llamaindex_quickstart.ipynb) include all the code for this quickstart,
as well as few additional examples - for example, how to use metadata filters to farther restrict the search results, or how to delete documents from the vector store.

## Full application

Ready to build something amazing? Check out our [TaskGenius example](https://github.com/niledatabase/niledatabase/tree/main/examples/ai/local_llama_task_genius).

The README includes step by step instructions on how to run the application locally.

Lets go over a few of the code highlights:

### Use of LlamaIndex with FastAPI

The example is a full-stack application with a FastAPI back-end and a React front-end.

When we initialize FastAPI, we create an instance of our `AIUtils` class, which is responsible for interfacing with the Nile vector store and the OpenAI API.

```python
@asynccontextmanager
async def lifespan(app: FastAPI):
    # Initialize AIUtils - it is a singleton, so we are just saving on initialization time
    AIUtils()
    yield
    # Cleanup
    ai_utils = None

app = FastAPI(lifespan=lifespan)
```

Then, in the handler for the `POST /api/todos` endpoint, we use the `AIUtils` instance to generate a time estimate for the todo item, and to store the embedding in the Nile vector store.

```python
@app.post("/api/todos")
async def create_todo(todo: Todo, request: Request, session = Depends(get_tenant_session)):
    ai_utils = AIUtils() # get an instance of AIUtils
    todo.tenant_id = get_tenant_id();
    todo.id = str(uuid4())
    estimate = ai_utils.ai_estimate(todo.title, todo.tenant_id)
    todo.estimate = estimate
    ai_utils.store_embedding(todo)
    session.add(todo)
    session.commit()
    return todo
```

### Using LlamaIndex with Ollama

If you look at the `AIUtils` class, you'll see that it is very similar to the simple quickstart example earlier. Except we use Ollama instead of OpenAI.

We initialize the vector store and the index in the `__init__` method:

```python
# Initialize settings and vector store once
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")
Settings.llm = Ollama(model="llama3.2", request_timeout=360.0)

self.vector_store = NileVectorStore(
    service_url=os.getenv("DATABASE_URL"),
    table_name="todos_embedding",
    tenant_aware=True,
    num_dimensions=768
)
self.index = VectorStoreIndex.from_vector_store(self.vector_store)
```

Then to store the embedding in the Nile vector store, we do exactly what we did in the quickstart example - enrich the todo item with the tenant ID and insert it into the index:

```python
document = Document(
    text=f"{todo.title} is estimated to take {todo.estimate} to complete",
    id_=str(todo.id),
    metadata={"tenant_id": str(todo.tenant_id)}
)
```

To get an estimate, we create a query engine for the tenant and use it to query the index, just like we did in the quickstart example:

```python
query_engine = self.index.as_query_engine(vector_store_kwargs={
    "tenant_id": str(tenant_id),
})

response = query_engine.query(
    f'you are an amazing project manager. I need to {text}. How long do you think this will take? '
    f'respond with just the estimate, no yapping.'
)
```

### Using FastAPI with Nile for tenant isolation

If you look at the `POST /api/todos` handler, you'll see that we get all the todos for a tenant without needing to do any filtering.
This is because the `get_tenant_session` function returns a session for the tenant database:

```python
@app.get("/api/todos")
async def get_todos(session = Depends(get_tenant_session)):
    results = session.exec(select(Todo.id, Todo.tenant_id,Todo.title, Todo.estimate, Todo.complete)).all()
    return results
```

`get_tenant_session` is implemented in `db.py` and it is a wrapper around the `get_session` function that sets the `nile.tenant_id` and `nile.use_id` context.

```python
def get_tenant_session():
    session = Session(bind=engine)
    try:
        tenant_id = get_tenant_id()
        user_id = get_user_id()
        session.execute(text(f"SET LOCAL nile.tenant_id='{tenant_id}';"))
        # This will raise an error if user_id doesn't exist or doesn't have access to the tenant DB.
        session.execute(text(f"SET LOCAL nile.user_id='{user_id}';"))
        yield session
```

The tenant_id and user_id are set in the context by our custom FastAPI middleware. It extracts the tenant ID from the request headers and user token from the cookies.

You can see it in `tenant_middleware.py`.

```python
class TenantAwareMiddleware:
    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        headers = Headers(scope=scope)
        maybe_tenant_id = headers.get("X-Tenant-Id")
        maybe_set_context(maybe_tenant_id, tenant_id)
        request = Request(scope)
        token = request.cookies.get("access_token")
        maybe_user_id = get_user_id_from_valid_token(token)
        maybe_set_context(maybe_user_id, user_id)
        await self.app(scope, receive, send)
```

### Summary

This example shows how to use LlamaIndex with Nile to build a RAG application. It demonstrates how to store and query documents in a tenant-aware vector store, and how to use
metadata filters to further restrict the search results. It also shows how to use FastAPI with Nile to build a full-stack application with tenant isolation.
